Expansion of the field of informetrics: Origins and consequences

Author

  • Leo Egghe
Abstract

This introductory editorial first discusses the reasons for the clear growth of the field of informetrics (bibliometrics, scientometrics, webometrics, ...). This growth has led some journals to increase their number of volumes or the number of issues per volume. The journal Information Processing and Management decided to devote two special issues (this one and another to come in 2006) to the broad topic "Informetrics". The aim of these special issues is to attract good papers that gather important data sets and/or present original models and explanations. We then briefly discuss the content of the papers published in this special issue. They deal with models, mapping of science (cocitation, coword analysis), web sites and search engines, collaboration in digital libraries and the newest topic in informetrics: use of and access to articles in digital libraries.

I. THE GROWTH OF THE FIELD OF INFORMETRICS

In this introductory paper, we will use the term "informetrics" as the broad term comprising all -metrics studies related to information science, including bibliometrics (bibliographies, libraries, ...), scientometrics (science policy, citation analysis, research evaluation, ...), webometrics (metrics of the web, the Internet or other social networks such as citation or collaboration networks), ... The term informetrics was introduced by Blackert and Siegel (1979) and by Nacke (1979) but gained popularity, e.g. through the organization of the international informetrics conferences in 1987 (see Egghe and Rousseau (1988, 1990)). However, the field of informetrics (though not the name) had already started in the first half of the twentieth century, e.g. with the works of Lotka, Bradford and Zipf (see Lotka (1926), Bradford (1934) and Zipf (1949); for the law of Zipf, see also Condon (1928) or even Estoup (1916)).
The term bibliometrics was coined in Pritchard (1969) and the term scientometrics was coined in Nalimov and Mul'čenko (1969) in Russian: naukometrija. For more on the history of these and other terms see White and McCain (1989), Ikpaahindi (1985), Lawani (1981), Tague-Sutcliffe (1994), Brookes (1990), Wilson (1999), Egghe and Rousseau (1990) and Egghe (2005).

That the field of informetrics grew in the twentieth century is evident, but this growth has become more and more clear in recent decades. Lipetz (1999) describes an exponential growth of JASIS, now called JASIST (Journal of the American Society for Information Science and Technology, which had existed for 50 years in 1999), in terms of the number of papers, the number of authors and even the average number of references per paper. Lipetz (1999) also shows that the average number of authors per paper is increasing. Authors are also responsible for a multidisciplinary growth of the field of informetrics (see Summers, Oppenheim, Meadows, McKnight and Kinnell (1999)), which also indicates the influence of informetrics on other scientific disciplines. Multidisciplinarity is evident if one looks at the "new" topics that informetrics is covering: the metrics of the web, the Internet, intranets and other social networks such as citation or collaboration networks. In general one can say that the creation of the "information society" is responsible for the growth of the field of informetrics.

So we can say that the field of informetrics nowadays comprises the fast-growing field of webometrics (see Hood and Wilson (2001)). Netometrics, as introduced in Bossy (1995), would be a better term since it also covers non-web activities, but the term does not seem to be becoming popular (see Hood and Wilson (2001)). Cybermetrics also exists (it is even the name of an electronic journal under the editorial direction of I. Aguillo) but it is not clear whether it will, some day, overtake the term webometrics.
Schubert (2002) describes 50 volumes of the journal Scientometrics and also finds an increase in the number of authors and in the extent to which they collaborate, in the sense that the average number of authors per paper increases (the same conclusions as in Lipetz (1999)). Schubert also remarks that there is no evidence that the degree of "hardness" of the field of informetrics is increasing, a point to keep in mind for the future evolution of this field. He and Spink (2002) describe foreign authorship in JASIST and JDOC (Journal of Documentation) and show that its share in these journals becomes larger and larger, indicating an increasing internationalization of the field of informetrics. The latter is also illustrated in Bar-Ilan (2000), who observes that the articles in the Proceedings of the international informetrics conferences are increasingly cited.

The extension of information science to networks and the information society in general has the consequence that more and more data are gathered automatically. This implies that data can be gathered much faster than before but also that the accuracy is dropping. There are several reasons for this. First of all, one gets data from a documentary system (e.g. an OPAC, a secondary or primary electronic database or a digital library) but, since there is in general no clear definition of the topics due to a lack of standards (see Glänzel (1996), Rousseau (2002)), one is not completely sure of what one gets. In addition, an electronic system may suffer from system breakdowns, in which case one is obliged to make inexact interpolations. Data on electronic services and activities accessed through the web (as many data now are) are also of a different nature than data gathered directly from a computer system. An example is connect time versus number of connections. When entering a computer system directly or via telephone lines (e.g. an OPAC or the DIALOG system) one is able to report on the connect time. When using a documentary system via the web one can no longer report on connect time but only on the number of connections (cf. the well-known DIALOG units).

Networks such as the web typically have connections between the sites, which are called hyperlinks (in-links when a site receives a hyperlink from another site; out-links when a site gives a hyperlink to another site). Their informetric distributions have been studied even in journals such as Nature and Science (see e.g. Albert, Jeong and Barabási (1999), Barabási and Albert (1999) and Huberman, Pirolli, Pitkow and Lukose (1998)) but also in physics journals (see e.g. Barabási, Jeong, Néda, Ravasz, Schubert and Vicsek (2002) and Adamic, Lukose, Puniyani and Huberman (2001)), again showing the interdisciplinary character of present-day informetrics. Hyperlinks are usually compared with the better known citations but they are very different in nature: hyperlinks cannot be used for aging or author collaboration studies since they are not dated and are usually anonymous. Hyperlinks can be used for determining "authoritative" web sites or documents (see CLEVER (1999)), which in turn can be used in information retrieval (IR). Also in IR, quantitative methods, e.g. for the evaluation of searches and systems, have changed drastically because of the way search engines deliver search results: they give the retrieved documents in decreasing order of expected relevance, which creates the need for evaluation measures on ordered sets instead of the classical ones (e.g. recall, precision, Jaccard, Cosine, Dice, ...) on ordinary sets (cf. Egghe and Michel (2002, 2003)).

It is very important to mention that, since most articles nowadays appear in electronic journals and/or repositories, there are new possibilities for measuring the use of articles, not only by citations or web citations but also by their number of downloads.
Downloads can be considered as electronic versions of the reading or photocopying of a paper article. The latter indicators were never studied due to the great difficulty of manual data gathering. Hence the study of downloads and their relation with (web) citations is intriguing, see Antelman (2004), Brody and Harnad (2004), Harnad and Brody (2004a,b) and Perneger (2004).

It is clear from the above that the extension of informetrics to electronic activities, e.g. on the web, gives a boost to the challenge of data gathering and data management and hence to the growth of the field. The need for more publication outlets, which is a consequence of this, is also clearly seen if one looks at the two important informetrics journals JASIST and Scientometrics. JASIST decided in 1998 to increase its publication flow from 12 issues to 14 issues a year. Scientometrics is publishing, from 2005 onwards, 12 issues instead of 9 issues per year. In this connection I want to offer some personal advice, which is shared by the informetrics colleagues I contacted recently. The increase in publication outlets also increases the need for refereeing. It is my personal feeling that one should expand the list of possible referees in informetrics to include younger informetricians: my refereeing workload doubled in 2004, a phenomenon that is recognized by fellow informetricians.

Apart from JASIST and Scientometrics, the present journal Information Processing and Management (IPM) is the only journal that regularly publishes papers devoted to informetrics studies although, in general, IPM is more focused on the subfield of informetrics dealing with quantitative aspects of IR. Elsevier, the publisher of IPM, is interested in whether a more pronounced general informetrics component is possible in IPM. Here we want to stress that the principal goal is to give an outlet to high quality papers in informetrics.
High quality papers are papers that present good mathematical (probabilistic) models and explanations of informetric regularities (in the broad sense) and/or papers in which interesting and important data gathering is presented. The former request (good models and explanations) can be understood in the framework of increasing the degree of "hardness" of the "science" informetrics (cf. Schubert (2002): as mentioned above, there is no evidence that the "hardness degree" has increased recently). The latter request (important data gathering) can be understood in the connection described above: the need for new informetric data coming from electronic environments such as the Internet, so that the regularities in these new media can be understood. Of course, important new data coming from "classical" informetric topics (e.g. cocitations) are also interesting. The papers in this special issue were selected based on these two broad principles. In the next section we will present a brief description of these papers.

II. THE PAPERS IN THIS SPECIAL ISSUE

Models can be found in five papers. The paper of Burrell, entitled "Symmetry and other transformation features of Lorenz/Leimkuhler representations of informetric data", deals with econometric aspects of informetrics by studying the Lorenz curve. He proves that the Lorenz curve determines the production distribution and examines powers of Lorenz curves. Self-symmetry aspects of Lorenz curves are also studied. The paper of Egghe (independently refereed), entitled "Continuous, weighted Lorenz theory and applications to the study of fractional relative impact factors", also deals with Lorenz curves. Here, relative impact factors, interpreted in the fractional way, are characterized by the construction of weighted Lorenz curves.
Within this model Egghe shows that if, for two situations, one fractional impact factor is larger than the other one, the same is true for all other fractional impact factors, and that this result is not true for "classical" impact factors using fixed time periods. The paper of Rousseau, entitled "Conglomerates as a general framework for informetric research", generalizes the well-known "information production processes" (IPPs) by adding the notion of a pool and a magnitude map for item-sets. In this generalization, conglomerates apply to (web) impact factors, Bradford-Lotka type bibliographies, word use, diffusion factors, elections and even bestseller lists. Generalized Zipf type distributions are studied in the paper of Shan, entitled "On the generalized Zipf's distribution. Part I". General Zipf type distributions are functions that show an approximately linear right tail on a log-log scale. Their characteristics are studied and these are used to describe Zipfian phenomena. The fifth paper on models is the paper of Lafouge and Prime Claverie, entitled "Links between entropy and production of information. Characterization of bibliometric distributions using the effort function". Here production distributions (such as the geometric (exponential) and the power model, i.e. Lotka's function) are characterized by corresponding effort functions and, in each case, the relation with the entropy is given.

Four papers deal with the "new" topic of use of and access to electronic articles in a digital library. In the paper of Kurtz, Eichhorn, Accomazzi, Grant, Demleitner, Henneken and Murray, entitled "The effect of use and access on citations", the authors study the possible influence of use of and access to articles prior to publication on later citations from three viewpoints: OA (Open Access), EA (Early Access) and SB (Self-selection Bias).
Zhao's paper, entitled "Challenges of scholarly publications on the web to the evaluation of science: a comparison of author visibility on the web and in print journals", reveals different patterns of scholarly communication on the web and in print journals and promotes the idea of a "two tier" communication and evaluation system, complementing the Web of Science databases. A similar topic is addressed in the paper of Bollen, Van de Sompel, Smith and Luce, entitled "Toward alternative metrics of journal impact. A comparison of usage and citation data". They determine alternative journal impacts based on network centrality measures and conclude that the "classical" impact factors cannot be the sole assessment of journal impact, hence again needing a "two tier" system in which journal impact measures based on usage data are also used. The fourth paper in this subfield is by Nicholas, Huntington, Dobrowolski, Rowlands, Hamid Jamali and Polydoratou and is entitled "Revisiting 'obsolescence' and journal article 'decay' through usage data: an analysis of digital journal use by year of publication". Hence, as in the two previous papers, usage of articles in a digital library is taken as an alternative to citations, but now to determine obsolescence or aging.

Collaboration (co-authorship) is a classical subfield of informetrics. In this special issue we have two papers dealing with this topic but in the environment of web networks or digital libraries. The paper of Liu, Bollen, Nelson and Van de Sompel, entitled "Co-authorship networks in the digital library research community", deals with social network analysis applied to the co-authorship network of past digital library conferences. A variant of PageRank, AuthorRank, is introduced and results are compared with other ranking techniques such as ranks based on network centrality measures.
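To make the idea of ranking authors in a co-authorship network concrete, here is a minimal sketch of plain weighted PageRank computed by power iteration. It is not the AuthorRank definition of Liu et al., whose edge weighting is not described in this editorial; the author labels, the co-authorship counts and the damping factor 0.85 are assumptions chosen for illustration only.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration.

    weights[u][v] is the (hypothetical) weight of the edge u -> v.
    """
    nodes = sorted(set(weights) | {v for out in weights.values() for v in out})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node first receives the uniform "teleportation" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for u in nodes:
            out = weights.get(u, {})
            total = sum(out.values())
            if total == 0:
                # Node without out-links: spread its rank over all nodes.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
            else:
                # Pass rank along each edge in proportion to its weight.
                for v, w in out.items():
                    new_rank[v] += damping * rank[u] * w / total
        rank = new_rank
    return rank


# Made-up example: A wrote 3 papers with B and 1 paper with C.
coauthorship = {
    "A": {"B": 3, "C": 1},
    "B": {"A": 3},
    "C": {"A": 1},
}
scores = pagerank(coauthorship)
for author in sorted(scores, key=scores.get, reverse=True):
    print(author, round(scores[author], 3))
```

In this toy network the author with the most and strongest co-authorship ties ends up ranked first, which is the kind of ordering that can then be compared with rankings based on network centrality measures.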
Indicators of gender centrality and bibliometric and web indicators of gender cooperation are applied to the set of multi-authored publications of 64 COLLNET members in the paper of Kretschmer and Aguillo, entitled "New indicators for gender studies in web networks".

Further web studies are found in the following two papers. First there is the paper of Payne and Thelwall, entitled "Mathematical models for academic webs: linear relationship or non-linear power law?". Here it is shown experimentally that the relation between the research of a university and the links to the university's web site is linear and not a non-linear power law. In the paper of Bar-Ilan, entitled "Comparing rankings of search results on the web", the rankings of the results of several search queries are compared in Google, AlltheWeb, AltaVista and HotBot, and it is concluded that the employed ranking algorithms are considerably different.

There are three papers dealing with the mapping of science. Moya-Anegon, Vargas-Quesada, Chinchilla-Rodríguez, Corera-Álvarez, Herrero-Solana and Munoz-Fernández have a paper entitled "Domain analysis and information retrieval through the construction of heliocentric maps based on ISI-JCR category cocitation". Based on the JCR Subject Categories and cocitation between them, they construct heliocentric maps of major scientific domains in Spain, France and England and the results are compared. Cocitation is also used in the paper of Marshakova-Shaikevich, entitled "Bibliometrics maps of field of science". Using again data from the ISI (Thomson) citation indexes, maps are constructed based on journal cocitation and lexical analysis of keywords in the titles and texts. The same source is used in the paper of Glenisson, Glänzel, Janssens and De Moor, entitled "Combining full-text and bibliometric information in mapping scientific disciplines".
This combined methodology of text mining (coword analysis) and bibliometric techniques (cluster analysis) is applied to the papers in the 2003 volume of the journal Scientometrics. Coword analysis is also applied in the last paper in this special issue. It is the paper of Onyancha and Ocholla, entitled "An informetric investigation of the relatedness of opportunistic infections to HIV/AIDS". Through the analysis of published articles one can show the disease-gene relationship, i.e. the relatedness of the AIDS-defining diseases in persons with documented HIV infection. Coword analysis is used to calculate the strength of association between the descriptors of the diseases and the gene.
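As an illustration of what a coword "strength of association" can look like, the sketch below counts how often two descriptors are assigned to the same article and normalizes the count with the Jaccard index. This is only one common choice and is an assumption here: the editorial does not say which association measure Onyancha and Ocholla use, and the descriptor sets are invented for the example.

```python
from collections import Counter
from itertools import combinations


def coword_jaccard(documents):
    """Jaccard association strength for every co-occurring descriptor pair.

    documents is a list of descriptor lists, one list per article.
    """
    term_counts = Counter()
    pair_counts = Counter()
    for descriptors in documents:
        unique = sorted(set(descriptors))  # count each descriptor once per article
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return {
        (a, b): both / (term_counts[a] + term_counts[b] - both)
        for (a, b), both in pair_counts.items()
    }


# Invented descriptor sets for four hypothetical articles.
articles = [
    ["hiv", "tuberculosis"],
    ["hiv", "tuberculosis", "candidiasis"],
    ["hiv", "candidiasis"],
    ["tuberculosis"],
]
strength = coword_jaccard(articles)
for pair, value in sorted(strength.items(), key=lambda item: -item[1]):
    print(pair, round(value, 3))
```

A value of 1 would mean that the two descriptors always appear together, and pairs that never co-occur are simply absent from the result.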

Journal:
  • Inf. Process. Manage.

Volume: 41

Publication date: 2005